NetNews Offline 2




Internet Message Format | 1996-08-05 | 2.1 KB

Path: news.eunet.fi!fipnet!kone!jsaarinen
Newsgroups: comp.sys.amiga.programmer
X-NewsReader: IntuiNews 1.2b (31.7.94)
References: <38232464@kone.fipnet.fi> <4ga21v$lsk@brachio.zrz.TU-Berlin.DE>
From: "Jyrki Saarinen" <jsaarinen@kone.fipnet.fi>
Date: Wed, 21 Feb 96 18:00:40 UT
Comments: Illegal date header - new date added by quicknews
X-Original-Date: Wed, 21 Feb 96 13:01:37
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: binary
Subject: Re: Texture/Gouraud innerloop speedtests
Message-ID: <38232562@kone.fipnet.fi>

> >Ok, I did a little research. My CPU is a 40MHz 68040,
> >a Warp Engine with a very fast memory system, maybe
> >this is the reason I did not gain any speed even if
> >I turned the data cache and thus data burst off,
> >with data burst everything was about 50% slower.

> Not very surprising! Data burst means that whenever
> a cache-miss occurs the CPU loads 4 longwords around
> the mem area where the data to be fetched is. For a
> tmapping loop this means that for almost any pixel that
> is fetched from the texture the CPU keeps the bus busy
> for 4 mem cycles!

;) I said 50% slower when I switched the data cache OFF.

> >So the frame rates were for a 320x256 screen:
> >Texture/Gouraud/Shading table, 64k aligned: ~43 fps
> >Plain Texture, 64k aligned: ~67 fps

> fps? Are these figures for the mere repetition (320*256 times)
> of the innerloop?

Yep.

> >  move.b (a3,d0.l),d1
> >  move.b (a4,d1.l),(a0)+
> [...]
> >  dbf d7,poly
> >  rts

> If I understand your problem right you wonder why the
> two version are almost equal in terms of speed? The scheduling
> is not optimal in both versions, you use the data that you
> fetch in the next instruction.

If you have read my posting .. I said I could not speed up the
"normal index version" of the routine, I tried all the possible
instruction combinations. Besides, the 64k-aligned routine is
~20% faster. And it is properly scheduled at least on the 040,
I could gain about ~10-15% by changing instructions to that
order where they are now.

-- 
   _     a Stellar programmer
_ //     "Amiga - back for the future"
\X/
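
For anyone following the thread without the original listing at hand, here is a rough C model of what the two quoted move.b instructions appear to do, plus the arithmetic that turns the quoted frame rates into clock cycles per pixel. It is only a sketch under assumptions that are not in the posting: the texture is taken to be 256x256 bytes indexed by (v << 8) | u, the shading table is taken to be 256 rows of 256 bytes with the light level selecting the row, and u, v and light are stepped in 8.8 fixed point. The names fill_span and cycles_per_pixel are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one textured, shaded span.
 * Assumed layouts (not stated in the posting):
 *   texture: 256x256 bytes, index = (v << 8) | u
 *   shade:   256x256 bytes, index = (light << 8) | texel
 * u, v and light are 8.8 fixed point: integer part in bits 8..15. */
void fill_span(uint8_t *dst, size_t count,
               const uint8_t *texture, const uint8_t *shade,
               uint32_t u, uint32_t v, uint32_t light,
               uint32_t du, uint32_t dv, uint32_t dlight)
{
    while (count--) {
        /* Build the 16-bit texel index: v in the high byte, u in the low. */
        uint32_t texel_index = (v & 0xFF00u) | ((u >> 8) & 0xFFu);

        /* Roughly move.b (a3,d0.l),d1: fetch one texel. */
        uint32_t texel = texture[texel_index];

        /* Roughly move.b (a4,d1.l),(a0)+: look the texel up in the
         * shading table, store the result and advance the destination. */
        *dst++ = shade[(light & 0xFF00u) | texel];

        u += du;
        v += dv;
        light += dlight;
    }
}

/* Average clock cycles available per pixel at a given frame rate,
 * assuming the whole frame time is spent in the inner loop. */
double cycles_per_pixel(double clock_hz, unsigned width,
                        unsigned height, double fps)
{
    return clock_hz / ((double)width * (double)height * fps);
}
```

Calling cycles_per_pixel(40e6, 320, 256, 43.0) and cycles_per_pixel(40e6, 320, 256, 67.0) gives roughly 11.4 and 7.3 cycles per pixel for the two quoted figures, counting loop overhead and memory stalls together. Note that each texel fetch depends on the index computed just before it, and the table lookup depends on the texel, which is the dependency the scheduling remarks in the thread are about.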